-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[mlir][xegpu] Add definition of SliceAttr #150146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
/// drops the data in the specified dimension, and return the rest. e.g., | ||
/// for data = [32, 64, 8], dropPositions = [0, 2], it will return [64] | ||
template<typename T, typename U> | ||
static llvm::SmallVector<T> dropDims(llvm::ArrayRef<T> data, llvm::ArrayRef<U> dropPositions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit : dropDims -> sliceDims? data->dims, dropPosition -> dropDims
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it to static llvm::SmallVector<T> slice(llvm::ArrayRef<T> shape, llvm::ArrayRef<U> dims)
. is it good to you?
Value add = rewriter.createOrFold<index::AddOp>( | ||
loc, tdescOffsets[i], | ||
getValueOrCreateConstantIndexOp(rewriter, loc, oldOffsets[idx])); | ||
newOffsets.push_back(add); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: oldOffsets: wgTileOffsets
newOffsets: sgTileOffsets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good suggestion. Fixed.
}]; | ||
|
||
let assemblyFormat = "`<` struct(params) `>`"; | ||
let genVerifyDecl = 1; | ||
} | ||
|
||
|
||
def XeGPU_SliceAttr : XeGPUAttr<"Slice", "slice", [LayoutTrait]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is desirable to allow nested slice attribute to match the staged reduction use case, where reduction may follow another reduction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the support for nested SliceAttr has been enabled. @Jianhui-Li
mlir/test/Dialect/XeGPU/layout.mlir
Outdated
@@ -50,4 +50,21 @@ gpu.func @convert_layout_wg(%a: vector<32x64xf16>) { | |||
gpu.return | |||
} | |||
|
|||
gpu.func @slice_attr_repeat_dim() { | |||
//CHECK: arith.constant {layout_result_0 = #xegpu.slice<<sg_layout = [16, 1, 1], sg_data = [1, 8, 2]>, dims = [2]>} dense<8> : vector<16x8xindex> | |||
%cst = arith.constant {layout_result_0 = #xegpu.slice<<sg_layout = [16, 1, 1], sg_data = [1, 8, 2]>, dims = [2]>} dense<8> : vector<16x8xindex> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it be #xegpu.slice<<#xegpu.layout<sg_layout = [16, 1, 1], sg_data = [1, 8, 2]>>, dims = [2]>}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the print format issue. it is fixed.
@@ -7,6 +7,7 @@ add_mlir_dialect_library(MLIRXeGPUDialect | |||
|
|||
DEPENDS | |||
MLIRXeGPUIncGen | |||
MLIRXeGPUAttrInterfaceIncGen | |||
MLIRXeGPUAttrsIncGen | |||
MLIRXeGPUEnumsIncGen | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont you need to link MLIRIndexDialect ,MLIRAffineUtils here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Yes, they are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % comments.
"delinearizeSubgroupId", | ||
(ins "OpBuilder &": $builder, "Location":$loc, "Value":$linearId)>, | ||
InterfaceMethod<"Get the local offset to be accessed by the given subgroup Id", | ||
"FailureOr<SmallVector<SmallVector<Value>>>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the need for having a vector<vector<>> here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since each subgroup may be assigned with multiple blocks.
#r = #xegpu.slice<#l, dim = 0> | ||
|
||
%exp = math.exp %input {layout_result_0 = #l}: vector<256x128xf32> | ||
%red = vector.multi_reduction<add>, %exp, %acc [0] {layout_result_0 = #r}: vector<256x128xf32> to vector<128xf32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to add a comment here explaning the output layout of %red
.
here the output sg layout is still [8, 4]. however data is shared along dim 0. So effectively there are 8 slices of [1, 4] SG segments each owning [1, 32] data. For example SGs [0-7][0] owns the same 1x32 data segement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated it
/// drops/slices the shape in the specified dims, and return the rest. e.g., | ||
/// for shape = [32, 64, 8], dims = [0, 2], it will return [64] | ||
template<typename T, typename U> | ||
static llvm::SmallVector<T> slice(llvm::ArrayRef<T> shape, llvm::ArrayRef<U> dims) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this unrelated to XeGPUDialect. any reason for placing here? can it be moved to Utils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving it to Utils will incur recursive dependence for linker.
@@ -33,6 +37,54 @@ void XeGPUDialect::initialize() { | |||
>(); | |||
} | |||
|
|||
// generate offsets computing instructions for a subgroup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Use triple line comments for def/impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
if (!isWgLayout()) | ||
return failure(); | ||
|
||
// TODO: handle order attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe assert unavailability of Order attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
|
||
FailureOr<SmallVector<SmallVector<Value>>> | ||
LayoutAttr::getOffsets(OpBuilder &builder, Location loc, Value linearId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to add a comment on what the purpose of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
// check every element in dims is unique and smaller than rank | ||
llvm::SmallDenseSet<int64_t> seen; | ||
for (int64_t dim : dims.asArrayRef()) { | ||
if (dim >= rank) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we check if dim >= 0
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Fixed.
#define DBGS() (llvm::dbgs() << "[" DEBUG_TYPE "]: ") | ||
#define LDBG(X) LLVM_DEBUG(DBGS() << X << "\n") | ||
|
||
class TestStepOpPattern : public OpConversionPattern<vector::StepOp> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a comment on what this pattern is doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed.
Co-authored-by: Charitha Saumya <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
mlir_tablegen(XeGPUAttrInterface.h.inc -gen-attr-interface-decls) | ||
mlir_tablegen(XeGPUAttrInterface.cpp.inc -gen-attr-interface-defs) | ||
add_public_tablegen_target(MLIRXeGPUAttrInterfaceIncGen) | ||
add_dependencies(mlir-headers MLIRXeGPUAttrInterfaceIncGen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
auto newCreateNdOp = xegpu::CreateNdDescOp::create( | ||
rewriter, loc, newTdescTy, op.getSource(), globalOffsets, | ||
SmallVector<OpFoldResult> wgTileOffsets = op.getMixedOffsets(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe better to not use tile and just use wgOffsets and sgOffsets as var names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
}]; | ||
|
||
let parameters = (ins | ||
"xegpu::LayoutTrait": $parent, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is parent a trait and not LayoutAttr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is to support using both LayoutAttr and SliceAttr as parent. The later one is nested definition.
static SmallVector<SmallVector<Value>> | ||
genOffsetsComputations(OpBuilder &builder, Location loc, | ||
SmallVector<Value> sgId, ArrayRef<int64_t> sgLayout, | ||
ArrayRef<int64_t> sgShape, ArrayRef<int64_t> shape) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider generalizing sgShape to sgDataStride.
From the caller size, need to set as following :
sgDataStride[i] = sgData[i] if (wgData[i] % (sgData[i]*sgLayout[i])) == 0
sgDataStride[i] = 0 if wgData[i] == sgData[i]
if (auto maybeSgShape = getSgDataAsInt()) | ||
sgShape = maybeSgShape.value(); | ||
else if (auto ratio = computeShapeRatio(shape, sgLayout)) | ||
sgShape = ratio.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shape == ratio?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the variable name to derivedShape for clarification per discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you change the last place, but missed this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
//CHECK: [[C128:%.+]] = arith.constant 128 : index | ||
//CHECK: [[X:%.+]] = index.remu [[UX]], [[C128]] | ||
//CHECK: [[TDESC:%.+]] = xegpu.create_nd_tdesc [[ARG_0]][[[Y]], [[X]]] : memref<256x128xf32> -> !xegpu.tensor_desc<32x32xf32, #xegpu.layout<lane_layout = [1, 16], lane_data = [1, 1]>> | ||
%tdesc = xegpu.create_nd_tdesc %src[0, 0] : memref<256x128xf32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding a test:
%tdesc = xegpu.create_nd_tdesc %src[0, 0] : memref<256x32xf32>
-> !xegpu.tensor_desc<256x32xf32, #xegpu.layout<sg_layout = [8, 4], sg_data = [32, 32], lane_layout = [1, 16], lane_data = [1, 1]>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, added one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
#1104) Brings Slice attribute def (llvm/llvm-project#150146)
This PR introduces the definition of
SliceAttr
, which complements the existing LayoutAttr by providing a mechanism to describe data distribution for operations such asmulti_reduction
andbroadcast
. For potential convenience of lowering pass, A common interface (LayoutTrait) is created forLayoutAttr
andSliceAttr
to provide an util to compute offsets for a given subgroup (id), and workgroup-level problem size (shape). theWgToSgCreateNdOp
is updated to test the implementation of such interface forLayoutAttr
, and a test pass--test-xegpu-layout-interface
is introduced to test its implementation forSliceAttr
.Additional use cases and lowering support for
SliceAttr
will be addressed in subsequent PRs.